Data Embedding in Text for a Copier System

Authors

Anoop K. Bhattacharjya

Hakan Ancin

Abstract

In this paper, we present a scheme for embedding data in copies (color or monochrome) of predominantly text pages that may also contain color images or graphics. Embedding data imperceptibly in documents or images is a key ingredient of watermarking and data-hiding schemes. It is comparatively easy to hide a signal in natural images, since the human visual system is less sensitive to signals embedded in noisy image regions containing high spatial frequencies. In other instances, e.g., simple graphics or monochrome text documents, additional constraints need to be satisfied to embed signals imperceptibly. Data may be embedded imperceptibly in printed text by altering some measurable property of a font, such as the position of a character or the font size. This scheme, however, is not very useful for embedding data in copies of text pages, as that would require accurate text segmentation and possibly optical character recognition, both of which would considerably degrade the error-rate performance of the data-embedding system. Similarly, other schemes that alter pixels on text boundaries perform poorly due to boundary-detection uncertainties introduced by scanner noise, sampling, and blurring. The scheme presented in this paper ameliorates the above problems by using a text-region-based embedding approach. Since the bulk of documents reproduced today contain black-on-white text, this data-embedding scheme can form a print-level layer in applications such as copy tracking and annotation.

Similar resources

A Joint Semantic Vector Representation Model for Text Clustering and Classification

Text clustering and classification are two main tasks of text mining. Feature selection plays the key role in the quality of the clustering and classification results. Although word-based features such as term frequency-inverse document frequency (TF-IDF) vectors have been widely used in different applications, their shortcoming in capturing semantic concepts of text motivated researches to use...

A New Document Embedding Method for News Classification

Text classification is one of the main tasks of natural language processing (NLP). In this task, documents are classified into pre-defined categories. There is lots of news spreading on the web. A text classifier can categorize news automatically and this facilitates and accelerates access to the news. The first step in text classification is to represent documents in a suitable way t...

Textfax-Principle for new tools in the office of the future by WOLFGANG HORAK and WALTER WOBORSCHIL

By taking a closer look at today's office, we observe the following trend: The conventional typewriter is gradually being replaced by word-processors. These may merely be electric typewriters with a storage added or they may take on the form of highly sophisticated CRT workstations featuring screens carrying an entire standard size page and exchangeable storage media. These systems, which origi...

Investigating Embedded Question Reuse in Question Answering

The investigation presented in this paper is a novel method in question answering (QA) that enables a QA system to gain performance through reuse of information in the answer to one question to answer another related question. Our analysis shows that a pair of question in a general open domain QA can have embedding relation through their mentions of noun phrase expressions. We present methods f...

Model Based Method for Determining the Minimum Embedding Dimension from Solar Activity Chaotic Time Series

Predicting future behavior of chaotic time series system is a challenging area in the literature of nonlinear systems. The prediction's accuracy of chaotic time series is extremely dependent on the model and the learning algorithm. On the other hand the cyclic solar activity as one of the natural chaotic systems has significant effects on earth, climate, satellites and space missions. Several m...

Publication date: 1999
